Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes
Multi-hop Question Answering (QA) is a challenging task since it requires an
accurate aggregation of information from multiple context paragraphs and a
thorough understanding of the underlying reasoning chains. Recent work in
multi-hop QA has shown that performance can be boosted by first decomposing the
questions into simpler, single-hop questions. In this paper, we explore an
additional use of multi-hop decomposition from the perspective of explainable
NLP: creating explanations by probing a neural QA model with the decomposed
questions.
We hypothesize that in doing so, users will be better able to construct a
mental model of when the underlying QA system will give the correct answer.
Through human participant studies, we verify that exposing the decomposition
probes and answers to the probes to users can increase their ability to predict
system performance on a question instance basis. We show that decomposition is
an effective form of probing QA systems as well as a promising approach to
explanation generation. In-depth analyses show the need for improvements in
decomposition systems.
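The probing setup described above can be sketched roughly as follows. All names and the toy lookup tables here are hypothetical stand-ins for the paper's neural decomposer and QA model:

```python
def decompose(question):
    # Hypothetical stand-in for a learned question decomposer that maps a
    # multi-hop question to simpler single-hop probe questions.
    table = {
        "Who directed the film that won Best Picture in 1995?": [
            "Which film won Best Picture in 1995?",
            "Who directed Forrest Gump?",
        ],
    }
    return table.get(question, [question])

def qa_model(question):
    # Stub single-hop QA model (a real system would read context paragraphs).
    answers = {
        "Which film won Best Picture in 1995?": "Forrest Gump",
        "Who directed Forrest Gump?": "Robert Zemeckis",
    }
    return answers.get(question, "unknown")

def probe(question):
    """Return (sub-question, sub-answer) pairs that a user can inspect to
    build a mental model of whether the system's final answer is reliable."""
    return [(sub, qa_model(sub)) for sub in decompose(question)]

pairs = probe("Who directed the film that won Best Picture in 1995?")
```

The user-facing idea is that the pairs, not the code, are the explanation: a reader who sees a wrong sub-answer can predict that the final multi-hop answer is likely wrong too.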
Interpreting Neural Networks for and with Natural Language
In the past decade, natural language processing (NLP) systems have come to be built almost exclusively on a backbone of large neural models. As the landscape of feasible tasks has widened due to the capabilities of these models, the space of applications has also widened to include subfields with real-world consequences, such as fact-checking, fake news detection, and medical decision support. The increasing size and nonlinearity of these models results in an opacity that hinders efforts by machine learning practitioners and lay-users alike to understand their internals and derive meaning or trust from their predictions.
The fields of explainable artificial intelligence (XAI) and more specifically explainable NLP (ExNLP) have emerged as an active area for remedying this opacity and for ensuring models' reliability and trustworthiness in high-stakes scenarios, by providing textual explanations meaningful to human users. Models that produce justifications for their individual predictions can be inspected for the purposes of debugging, quantifying bias and fairness, understanding model behavior, and ascertaining robustness and privacy. Textual explanation is a predominant form of explanation in machine learning datasets regardless of task modality. As such, this dissertation covers both explaining tasks with natural language and explaining natural language tasks.
In this dissertation, I propose test suites for evaluating the quality of model explanations under two definitions of meaning: faithfulness and human acceptability. I use these evaluation methods to investigate the utility of two explanation forms and three model architectures. Finally, I propose two methods to improve explanation quality: one that increases the likelihood of faithful highlight explanations and one that improves the human acceptability of free-text explanations. This work strives to increase the likelihood of positive use and outcomes when AI systems are deployed in practice.
Explainable Prediction of Medical Codes from Clinical Text
Clinical notes are text documents that are created by clinicians for each
patient encounter. They are typically accompanied by medical codes, which
describe the diagnosis and treatment. Annotating these codes is labor intensive
and error prone; furthermore, the connection between the codes and the text is
not annotated, obscuring the reasons and details behind specific diagnoses and
treatments. We present an attentional convolutional network that predicts
medical codes from clinical text. Our method aggregates information across the
document using a convolutional neural network, and uses an attention mechanism
to select the most relevant segments for each of the thousands of possible
codes. The method is accurate, achieving precision@8 of 0.71 and a Micro-F1 of
0.54, both better than the prior state of the art. Furthermore, through an
interpretability evaluation by a physician, we show that the attention
mechanism identifies meaningful explanations for each code assignment.

Comment: NAACL 2018
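The per-code attention described above can be sketched as follows; the dimensions, names, and random inputs are illustrative, not the paper's exact architecture:

```python
import numpy as np

def per_code_attention(H, U):
    """Per-label attention over convolved document features.

    H : (n, d) array, one convolutional feature vector per document position
    U : (L, d) array, one attention query vector per medical code
    Returns V (L, d), code-specific document representations, and A (L, n),
    the attention weights that point at the text supporting each code.
    """
    scores = U @ H.T                                    # (L, n) relevance scores
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)                   # softmax over positions
    V = A @ H                                           # (L, d) weighted sums
    return V, A

rng = np.random.default_rng(0)
H = rng.normal(size=(30, 8))    # 30 positions, 8-dim features (toy sizes)
U = rng.normal(size=(5, 8))     # 5 hypothetical codes
V, A = per_code_attention(H, U)
```

The rows of A are what the physician evaluation inspects: for each code, the highest-weighted positions identify the text segments offered as the explanation.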
Reframing Human-AI Collaboration for Generating Free-Text Explanations
Large language models are increasingly capable of generating fluent-appearing
text with relatively little task-specific supervision. But can these models
accurately explain classification decisions? We consider the task of generating
free-text explanations using human-written examples in a few-shot manner. We
find that (1) authoring higher quality prompts results in higher quality
generations; and (2) surprisingly, in a head-to-head comparison, crowdworkers
often prefer explanations generated by GPT-3 to crowdsourced explanations in
existing datasets. Our human studies also show, however, that while models
often produce factual, grammatical, and sufficient explanations, they have room
to improve along axes such as providing novel information and supporting the
label. We create a pipeline that combines GPT-3 with a supervised filter that
incorporates binary acceptability judgments from humans in the loop. Despite
the intrinsic subjectivity of acceptability judgments, we demonstrate that
acceptability is partially correlated with various fine-grained attributes of
explanations. Our approach is able to consistently filter GPT-3-generated
explanations deemed acceptable by humans.

Comment: NAACL 2022 camera-ready. 13 pages main + references, 14 pages
appendix.
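The overgenerate-and-filter pipeline can be sketched as below. Both functions are stubs: the real pipeline samples candidate explanations from GPT-3 and filters them with a classifier trained on binary human acceptability judgments.

```python
def generate_explanations(prompt, n=4):
    # Stub for few-shot sampling from an LLM: returns n candidate
    # free-text explanations (hypothetical placeholder outputs).
    return [f"{prompt} -- candidate explanation {i}" for i in range(n)]

def acceptability_filter(explanation):
    # Stub for the supervised acceptability filter; here we pretend
    # candidate 0 would be judged unacceptable by the classifier.
    return not explanation.endswith("0")

def filtered_explanations(prompt):
    """Overgenerate with the LLM, then keep only the candidates the
    acceptability filter accepts (human-in-the-loop pipeline sketch)."""
    return [e for e in generate_explanations(prompt)
            if acceptability_filter(e)]

kept = filtered_explanations("Why is this review positive?")
```

The design choice is that subjective quality is enforced post hoc by a cheap learned filter rather than by constraining generation itself.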
Editing Commonsense Knowledge in GPT
Memory editing methods for updating encyclopedic knowledge in transformers
have received increasing attention for their efficacy, specificity, and
generalization advantages. However, it remains unclear if such methods can be
adapted for the more nuanced domain of commonsense knowledge. We propose
MEMIT_CSK, an adaptation of MEMIT to edit commonsense mistakes in GPT-2
Large and XL. We extend editing to various token locations and employ a robust
layer selection strategy. Models edited by MEMIT_CSK outperform the
fine-tuning baselines by 10.97% and 10.73% F1 scores on subsets of PEP3k and
20Q. We further propose a novel evaluation dataset, MEMIT-CSK-PROBE, that
contains unaffected neighborhood, affected neighborhood, affected paraphrase,
and affected reasoning challenges. MEMIT_CSK demonstrates favorable
semantic generalization, outperforming fine-tuning baselines by 13.72% and
5.57% overall scores on MEMIT-CSK-PROBE. These results suggest a compelling
future direction of incorporating context-specific user feedback concerning
commonsense in GPT by direct model editing, rectifying and customizing model
behaviors via human-in-the-loop systems.

Comment: Code and data are available at https://github.com/anshitag/memit_cs
Self-Refine: Iterative Refinement with Self-Feedback
Like people, LLMs do not always generate the best text for a given generation
problem on their first try (e.g., summaries, answers, explanations). Just as
people then refine their text, we introduce SELF-REFINE, a framework for
similarly improving initial outputs from LLMs through iterative feedback and
refinement. The main idea is to generate an output using an LLM, then allow the
same model to provide multi-aspect feedback for its own output; finally, the
same model refines its previously generated output given its own feedback.
Unlike earlier work, our iterative refinement framework does not require
supervised training data or reinforcement learning, and works with a single
LLM. We experiment with 7 diverse tasks, ranging from review rewriting to math
reasoning, demonstrating that our approach outperforms direct generation. In
all tasks, outputs generated with SELF-REFINE are preferred by humans and by
automated metrics over those generated directly with GPT-3.5 and GPT-4,
improving by an average of 20% absolute across tasks.

Comment: Code, data, and demo at https://selfrefine.info
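The loop itself is simple; a minimal sketch follows, where the three callables stand in for prompting the same LLM in different roles (the toy instantiation below is hypothetical, not the paper's prompts):

```python
def self_refine(task, generate, feedback, refine, max_iters=4):
    """Iterative refinement with a single model: generate an initial
    output, then repeatedly ask the same model for feedback and a
    revision until the feedback says the output is good enough."""
    output = generate(task)
    for _ in range(max_iters):
        fb = feedback(task, output)
        if fb is None:            # model judges the output acceptable
            break
        output = refine(task, output, fb)
    return output

# Toy instantiation with stub "model" calls.
def generate(task):
    return task.upper()

def feedback(task, output):
    return "too short" if len(output) < 10 else None

def refine(task, output, fb):
    return output + "!"           # act on the feedback

result = self_refine("hi", generate, feedback, refine)
```

With a real model, `feedback` would return multi-aspect critiques in text and `refine` would condition on both the previous output and that critique; no extra training data or reinforcement learning is involved.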
BCL11A Haploinsufficiency Causes an Intellectual Disability Syndrome and Dysregulates Transcription
Intellectual disability (ID) is a common condition with considerable genetic heterogeneity. Next-generation sequencing of large cohorts has identified an increasing number of genes implicated in ID, but their roles in neurodevelopment remain largely unexplored. Here we report an ID syndrome caused by de novo heterozygous missense, nonsense, and frameshift mutations in BCL11A, encoding a transcription factor that is a putative member of the BAF swi/snf chromatin-remodeling complex. Using a comprehensive integrated approach to ID disease modeling, involving human cellular analyses coupled to mouse behavioral, neuroanatomical, and molecular phenotyping, we provide multiple lines of functional evidence for phenotypic effects. The etiological missense variants cluster in the amino-terminal region of human BCL11A, and we demonstrate that they all disrupt its localization, dimerization, and transcriptional regulatory activity, consistent with a loss of function. We show that Bcl11a haploinsufficiency in mice causes impaired cognition, abnormal social behavior, and microcephaly in accordance with the human phenotype. Furthermore, we identify shared aberrant transcriptional profiles in the cortex and hippocampus of these mouse models. Thus, our work implicates BCL11A haploinsufficiency in neurodevelopmental disorders and defines additional targets regulated by this gene, with broad relevance for our understanding of ID and related syndromes.

Funding: Wellcome Trust (grant number WT098051). Published open access.
Inferring the Reader: Guiding Automated Story Generation with Commonsense Reasoning
Transformer-based language model approaches to automated story generation
currently provide state-of-the-art results. However, they still suffer from
plot incoherence when generating narratives over time, and critically lack
basic commonsense reasoning. Furthermore, existing methods generally focus only
on single-character stories, or fail to track characters at all. To improve the
coherence of generated narratives and to expand the scope of character-centric
narrative generation, we introduce Commonsense-inference Augmented neural
StoryTelling (CAST), a framework for introducing commonsense reasoning into the
generation process with the option to model the interaction between multiple
characters. We find that our CAST method produces significantly more coherent,
on-topic, enjoyable, and fluent stories than existing models in both the
single-character and two-character settings across three storytelling domains.
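One way to picture a commonsense-guided generation step in the spirit of CAST is sketched below; the inference table and the overlap test are hypothetical stand-ins for the paper's learned commonsense-inference model and its matching procedure:

```python
def infer(sentence):
    # Stub commonsense-inference model: returns a set of implied facts
    # about the characters (a real system would use a learned model).
    facts = {
        "Alice picked up the sword.": {"Alice has the sword"},
        "Alice swung the sword.": {"Alice has the sword"},
        "Alice read a book.": {"Alice has a book"},
    }
    return facts.get(sentence, set())

def cast_step(context_facts, candidates):
    """Keep only candidate continuations whose commonsense inferences are
    consistent with what the story so far already implies; fall back to
    the first candidate if none match."""
    keep = [c for c in candidates if infer(c) & context_facts]
    return keep[0] if keep else candidates[0]

context_facts = infer("Alice picked up the sword.")
nxt = cast_step(context_facts, ["Alice read a book.", "Alice swung the sword."])
```

Filtering continuations by inferred character state, rather than by surface text alone, is what lets this style of method track multiple characters and keep the plot coherent over time.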